The pancancer overexpressed NFYC Antisense 1 controls cell cycle mitotic progression through in cis and in trans modes of action

Antisense RNAs (asRNAs) represent an underappreciated yet crucial layer of gene expression regulation. Generally thought to modulate their sense genes in cis through sequence complementarity or their act of transcription, asRNAs can also regulate different molecular targets in trans, in the nucleus or in the cytoplasm. Here, we performed an in-depth molecular characterization of NFYC Antisense 1 (NFYC-AS1), the asRNA transcribed head-to-head to NFYC subunit of the proliferation-associated NF-Y transcription factor. Our results show that NFYC-AS1 is a prevalently nuclear asRNA peaking early in the cell cycle. Comparative genomics suggests a narrow phylogenetic distribution, with a probable origin in the common ancestor of mammalian lineages. NFYC-AS1 is overexpressed pancancer, preferentially in association with RB1 mutations. Knockdown of NFYC-AS1 by antisense oligonucleotides impairs cell growth in lung squamous cell carcinoma and small cell lung cancer cells, a phenotype recapitulated by CRISPR/Cas9-deletion of its transcription start site. Surprisingly, expression of the sense gene is affected only when endogenous transcription of NFYC-AS1 is manipulated. This suggests that regulation of cell proliferation is at least in part independent of the in cis transcription-mediated effect on NFYC and is possibly exerted by RNA-dependent in trans effects converging on the regulation of G2/M cell cycle phase genes. Accordingly, NFYC-AS1-depleted cells are stuck in mitosis, indicating defects in mitotic progression. Overall, NFYC-AS1 emerged as a cell cycle-regulating asRNA with dual action, holding therapeutic potential in different cancer types, including the very aggressive RB1-mutated tumors.

modifications (ENCODE) are reported.B Violin plots of NFYC-AS1 and C NFYC expression across GTEx human tissues (Data Source: GTEx Analysis Release V8 (1)).Boxes show median and interquartile ranges.Supplementary Figure S3: NFYC-AS1 transcript reannotation.A NFYC(-AS1) locus at chr1p34.2 as from UCSC Genome Browser (GRCh38/hg38 assembly).On the top, the design of qRT-PCR primers and Gapmers is shown together with the NFYC-AS1 reconstructed annotation.On the bottom, the design of the primers and the sequencing output is reported for both 5'RACE and 3'RACE assays.B Bar plot showing the ratio of NFYC-AS1 expression as measured from cDNA retrotranscribed using random oligo-dT primers versus random primers.qRT-PCR was conducted using different primers (indicated on top of the bars) spanning the whole NFYC-AS1 sequence.Data are reported as mean ± sd as from n=3 independent biological replicates.C Schematic workflow for the study of the noncoding potential of NFYC-AS1 using ORF Finder (4), CPAT (5), CNIT (6), LGC (7), and POSTAR (8) tools.A Relation between the estimated accessibility of each Gapmer-targeted region calculated as the average of the accessibility percentage of the single bases belonging to the NFYC-AS1 portions targeted by the different Gapmers (GAP1-GAP6) (as from S-fold( 9)) and the NFYC-AS1 knockdown level reached with the different Gapmers in H520 RB1 wt cells.The NFYC-AS1 accessibility analysis was performed using our reconstructed annotation to estimate the accessibility of each Gapmertargeted region for GAP2-GAP6 and the NCBI RefSeq annotation for GAP1.B NFYC-AS1 predicted secondary structure (as from RNAfold (10)); the different Gapmer-targeted regions are highlighted in the structure.Our reconstructed annotation was used for this analysis.C Bar plot showing the total cell count after NFYC-AS1 knockdown with 5 nM GAP1 used at 96h normalized against NEG in H520 RB1 wt cells.Data are reported as mean ± sd, as from n=3 independent biological replicates.p-values of the two-tailed unpaired t-test are reported.D Bar plot showing NFYC-AS1 expression measured through qRT-PCR using primer 4 (Pr4) or primer 6 (Pr6) at 48 hours after transfection with all six Gapmers (GAP1-GAP6), normalized against NEG in H82 RB1 mut cells.Gapmers were used at a final concentration of 5 nM.Data are reported as mean ± sd as from multiple independent biological replicates (n indicated in Supplementary Table S11).

Analysis of NFYC-AS1 expression in publicly available datasets
NFYC-AS1 expression in normal tissues was studied by the investigation of the Genotype-Tissue Expression (GTEx) portal (1).NFYC-AS1 expression in tumor tissues was analyzed in The Cancer Genome Atlas (TCGA) (2), considering only tissues with at least 10 normal and 10 tumor samples.Normalized TCGA data were downloaded from firebrowse webpage (http://firebrowse.org/).As of November 2020, RNAsequencing data were available for 576 LUAD samples, including 517 primary tumor samples and 59 non-tumor (normal) lung tissues, and for 552 LUSC samples, including 501 primary tumor samples and 51 normal lung tissues.Metadata were also downloaded with all the information related to the molecular subtype and the mutational status of each sample.The gene expression level was expressed as logarithm in base 2 of the normalized counts (norm) plus one [log2(norm+1)].The tumor-normal fold-change (FCT/N) was calculated as the ratio between the average normalized counts for NFYC-AS1 in tumors and in normal tissues.Instead, the NFYC-AS1/NFYC ratio was calculated as the ratio between the normalized count for NFYC-AS1 and for NFYC in each sample and then averaged.FASTQ files of the NSCLC dataset (GSE81089), SCLC dataset (GSE60052), and reprogrammed epithelial cells to neuroendocrine small cell (GSE118206) were retrieved using the fastq-dump utility of the SRA toolkit 2.9.4 version (https://sra-explorer.info/).Gene intensities were computed using RSEM-1.3.1 software using NCBI RefSeq (GRCh37/hg19) as a reference to get the transcript per million (TPM) of genes.The gene expression level was expressed as logarithm in base 2 of the TPM plus one [log2(TPM+1)].The tumor-normal fold-change (FCT/N) was calculated as the ratio between the average TPM for NFYC-AS1 in tumors and in normal tissues.The NSCLC dataset collects RNAsequencing data of 19 normal samples and 198 NSCLC samples, which were classified by performing Pearson correlation of each sample with a previously validated 42-genes-classifier (15).Samples with correlation values ranging from -0.17 to 0.17 were not further classified and were not considered, while values above 0.17 were predicted to be LUAD and values below -0.17 were predicted to be LUSC.This allowed the identification of 91 LUAD and 77 LUSC samples.Instead, the SCLC dataset contains 79 SCLC tumors and 7 normal tissues.Publicly available data for 1305 cell lines derived from different tumors retrieved from the Cancer Cell Line Encyclopedia (CCLE) (3) were downloaded from Depmap portal (version 20Q4, https://depmap.org/portal/).Metadata were also downloaded with all the information related to the histotype of origin and the mutational status for each cell line.The gene expression level was expressed as logarithm in base 2 of the TPM plus one [log2(TPM+1)].The mutated-wildtype foldchange (FCmut/wt) was calculated as the ratio between the average TPM for NFYC-AS1 in presence and absence of the mutations.Instead, the NFYC-AS1/NFYC was calculated as the ratio between the normalized count for NFYC-AS1 and for NFYC.SCLC cell lines were classified into NEUROD1, ASCL1 and YAP1 molecular subtypes according to Ireland et al. (16).
Receiver Operating Characteristic (ROC) curves were computed using the pROC R package to test the performance of NFYC-AS1 expression as a binary classifier of normal and tumor tissues in different cancer types, NSCLC, LUAD, and LUSC (TCGA) and of RB1-wildtype and RB1-mutated LUAD and LUSC tissues (TCGA), and lung cancer cell lines (CCLE).The area under the curve (AUC) is computed using the roc function from the pROC R package.For a good binary classifier an AUC > 0.5 is expected.

Nucleus-cytoplasm fractionation
For nucleus-cytoplasm fractionation, 3 x 10 6 H520 and H82 cells were collected and resuspended in 300 µL of lysis buffer A (Tris-HCl pH 7.0 10 mM; NaCl 140 mM; MgCl2 1.5 mM; NP-40 0.5%; and deionized water), left on ice for 5 minutes and then centrifuged.The supernatant (i.e., cytoplasm) was used for RNA extraction, while the pellet (i.e., nucleus) was washed three times with 300 µL of lysis buffer A and then resuspended in 300 µL of lysis buffer B (Tris-HCl pH 7.0 10 mM; NaCl 140 mM; MgCl2 1.5 mM; NP-40 0.5%; Tween-20 1%; deoxycholic acid 0.5%; and deionized water).After centrifugation, the pellet was used for RNA extraction.All reactions were carried out on ice and centrifugations at 1,000 g for 3 minutes at 4°C. cDNA of the nuclear and cytoplasmic fractions was analyzed through qRT-PCR to evaluate the percentage of NFYC-AS1 expression in the nucleus and cytoplasm.The nuclear lncRNA MALAT1 and the cytoplasmic protein coding GAPDH were used as positive controls for either fraction.

Coding potential
The coding potential of NFYC-AS1 was investigated through ORF Finder (4), and different alignmentfree tools, including CPAT (5), CNIT (6), and LGC (7), to search for possible open reading frames (ORFs), using NFYC-AS1 reconstructed annotation as input.Additionally, the translatome module of POSTAR (8) was used to assess the translational landscape based on Ribo-seq data of selected cell lines and tissues (Supplementary Figure S3C and Supplementary Tables S6-10).

RNA extraction, retrotranscription, and qRT-PCR
Total RNA was extracted with the miRNeasy Mini Kit (QIAGEN, Hilden, Germany) and DNAse I (QIAGEN, Hilden, Germany), according to manufacturer's instructions.RNA was reverse transcribed into cDNA using the M-MLV Reverse Transcriptase kit (GeneSpin Srl, Milan, Italy).When specifically stated, RNA was reverse transcribed into cDNA using either only random hexanucleotide primers or oligo-d(T)16 primer (Invitrogen™ by Thermo Fisher Scientific Inc., Waltham, MA, USA) with the High-Capacity cDNA Reverse Transcription Kit with RNase Inhibitor (Applied Biosystems by Thermo Fisher Scientific Inc., Waltham, MA, USA).Gene expression level was assessed through qRT-PCR using SsoAdvanced TM Universal SYBR Green Supermix (Bio-Rad, Hercules, CA, USA) and Bio-Rad CFX Connect TM RT-PCR instrument using the ribosomal protein S20 (RPS20) as endogenous gene.Gene expression level was calculated using the ΔΔCt method and expressed as relative quantity (rq).Rq values of each replicate experiment are reported in Original Data 1.Primers were designed with Primer3Plus software (17) and are listed in Supplementary Table S1.To properly measure NFYC-AS1 expression level and given the impossibility to design exon-spanning primers due to NFYC-AS1 monoexonic nature, we designed different primers covering the entire length of the transcript but not overlapping with NFYC (Supplementary Table S1).

Transcript decay
To study the transcript decay, 2 x 10 6 H520 and H82 cells were seeded in 60 mm and 35 mm dishes per time point, respectively.The day after, cells were treated with Actinomycin-D (Sigma-Aldrich, Saint Louis, MI, USA) at a final concentration of 10 µg/mL for 0, 1, 2, 4, and 8 hours.RNA extracted and retrotranscribed into cDNA of treated cells at different time points after recovery was analyzed through qRT-PCR to evaluate the decay of genes of interest.The normalization was done against the 0 h time point for each tested gene and c-MYC was used as a positive control, as it is a shortlived transcript.

Target accessibility analysis
The target accessibility analysis was based on the prediction of the structure of the RNA through Sfold ( 9).The estimated accessibility of each Gapmer-targeted region was calculated as the average of the accessibility percentage of each single base belonging to the NFYC-AS1 portions targeted by the different Gapmers.The NCBI RefSeq NFYC-AS1 annotation with the proximal TSS and the distal TSS was used as input for this analysis.Additionally, RNAfold web service (10) was used to obtain NFYC-AS1 predicted secondary structure.
Cell proliferation assay H520 cells, WT and DTSS H520 clones were seeded in 6-well plates at a density of 9 x 10 4 cells/well, whereas H82 cells were seeded 1.0 x 10 6 cells per T25 flask.The total/viable cell number was counted at 72, 96, and 120 hours after transfection/seeding with the TC20™ Automated Cell Counter (Bio-Rad, Hercules, CA, USA).Data were normalized against NEG/WT at 72h.For the confluency experiment, H520 cells were plated in 60 mm petri at different density, 2.0 x 10 5 , 4.0 x 10 5 , 8.0 x 10 5 , and 12.0 x 10 5 , and collected after 4 days for RNA extraction and qRT-PCR measurement.CCND1 RNA levels were used as marker of proliferation.

Cell cycle FACS analysis
Wild-type H520 cells, non-deleted and deleted clones were harvested, washed with PBS Dulbecco's Phosphate Buffered Saline (Euroclone S.p.A., Pero, Italy) and fixed with 70% ethanol.Cells were then washed twice with PBS and incubated in propidium iodide solution (PI 50 µg/ml, RNAse A 1 µg/ml, Tween 20 0.1%) for 30 min in the dark.DNA content was analyzed by BD Accuri™ C6 flow cytometer and BD CSampler Analysis Software (Becton Dickinson, Franklin Lakes, NJ, USA).At least 30,000 events were read, and cell cycle analysis was performed on ModFIT LT™ software according to the Modfit model (Becton Dickinson).

Survival analysis
We retrieved Progression Free Interval (PFI) time records of TCGA LUSC patients from the UCSC Xena web page (18) and overall survival (OS) of SCLC patients from GSE60052.We employed the Cutoff Finder tool (11) to find the optimal threshold for dichotomization of tumor samples based on the NFYC-AS1 expression and survival data.Survival analysis was performed according to the Kaplan-Meier analysis and a two-sided log-rank test using the survfit function from the survival R package.
The survival curve was plotted using ggsurvplot function from survminer R package.

TransCistor
Transcistor tool (14) was used to investigate the in cis and in trans modes of action of a NFYC-AS1 based on enrichment of targets amongst proximal and distal genes.Input files are DEseq2 list of differentially regulated genes and unchanged genes in each Gapmers (GAP2-GAP4) versus the Negative Control Gapmer, and in DTSS clones versus WT clones.

Immunoblotting analysis
For total protein extraction, cells were lysed in RIPA buffer (10mM Tris-HCl pH 8.0, 1 mM EDTA, 0.  S13).Incubated membranes were washed three times in TBS-T for 10 minutes with shaking at room temperature and incubated in the same conditions with secondary antibody for 1 hour.Peroxidase-conjugated secondary antibodies (Sigma-Aldrich, Saint Louis, MI, USA) were prepared 1:10,000 in 5% skimmed milk solution in TBS-T.Membranes were washed three times in TBS-T for 10 minutes with shaking at room temperature and briefly incubated in peroxidase solution (1:1 of Peroxidase and Enhancer Solution purchased by GeneSpin Srl, Milan, Italy).Immunoreactive bands were visualized using Bio-Rad ChemiDoc MP, with ImageLab software.Vinculin was used as housekeeping protein for normalization and quantification of bands intensity using ImageJ software.Full blots are displayed in Original Data 2.

Analysis of RNA-seq data
Differential expression analysis for each Gapmer compared with NEG and for DTSS clones compared with WT clones was performed using the R package DESeq2.Gene set enrichment analysis was performed on genes ranked by DESeq2 stat values by GSEA Pre-ranked (12) using GSEA Hallmark collection.The common enriched pathways for all the three Gapmers were considered and the heatmaps and bubble plot of the significant normalized enrichment scores (NES) when the false discovery rate (FDR) was lower than 0.10 for all the three Gapmers and clones were plotted using ComplexHeatmap and ggplot2 R packages, respectively.The fold-change (log2FC) of common leading-edge genes of UV response DN and G2/M checkpoint gene sets (Supplementary Table S12) were plotted together with the relative tumor/normal ratio (expressed as log2FC) in LUAD, LUSC (TCGA) and SCLC (GSE60052) cohorts.Moreover, GSEA C2 Reactome collection was tested on ranked gene lists of all Gapmers versus NEG and DTSS versus WT clones, and the relative bar plot of significant NES (FDR<0.05 in all the three Gapmers and clones) was plotted using the R software.Ranked gene list for Gapmers versus NEG was tested on GSEA C2 collection (specifically FISCHER G1/S CELL CYCLE, FISCHER G2/M CELL CYCLE, WHITFIELD CELL CYCLE G1/S, and WHITFIELD CELL CYCLE G2/M gene sets) and on other custom gene sets, including lung cancer vulnerabilities as from Sauta E et al. (19), and genes synthetic lethal in RB1-mutated cells as from Oser MG et al. (20).Moreover, we defined lists of confident targets for activator E2F factors, repressor E2F factors, FOXM1, MYBL2, and NF-Y, all known to take part in the transcriptional control of the different cell cycle phases (21)(22)(23)(24).We downloaded ChIP-seq data for these factors from ENCODE (FOXM1: GSE209315; MYBL2: GSE170799; NF-Y: GSE231181, GSE170375, GSE170694, GSE96244; E2F1/2/3: GSE231164, GSE169862, GSE169833, GSE169964; E2F4/5: GSE231060, GSE170651, GSE230897, GSE170241) (25), which were then aligned to NCBI RefSeq (GRCh38/hg38) annotation in order to get the list of genes having a peak in the promoter region.We also retrieved genomic positions of the consensus motif of these factors in the promoter region using findMotif tool downloaded from the UCSC Genome Browser Downloads page (26).Thus, confident targets were defined by the presence of ChIP-seq peak and the specific consensus motif in the promoter region, always defined as -1,000/+1,000 form the TSS.We used TTTSSCGC consensus motif for activator E2F1/2/3, TTTSSCGC and/or CHR (TTYRAA) consensus motif for repressor E2F4/5, CHR (TTYRAA) consensus motif for FOXM1 and MYBL2 (21,22), and CCAAT consensus motif for NF-Y (27).Target gene lists of these factors (reported in Supplementary Table S14) were used for GSEA analysis using ranked gene list for Gapmers versus NEG.
Boxplots of differential expression of NFYC in normal and tumor tissues (TCGA (2)).B ROC curves to test normal-tumor classification according to NFYC-AS1 expression in different cancer types (pancancer), LUAD or LUSC (TCGA).The area under the curve (AUC) and the confidence interval are reported in the figure.For a good classifier: 0.5 < AUC <1.C Boxplots of NFYC-AS1 expression level in normal tissues and in LUAD and LUSC NSCLC histotypes (GSE81089).D Boxplots of NFYC-AS1 expression level in normal curves: RB1-wt and RB1-mutated cancer Lung cancer cells (CCLE) Characterization of NFYC-AS1 knockdown phenotype by Gapmer ASOs.
p-values of the one sample t-test are reported.E Overall survival curve with stratification according to Cutoff Finder (11) -determined threshold (29.29) for NFYC-AS1 in SCLC patients (GSE60052).p-value calculated using the log-rank test and the hazard risk (HR) are reported in the figure.*p<0.05,**p<0.01,***p<0.001,****p<0.0001,ns (non-significant).Supplementary Figure S5: Characterization of NFYC-AS1 knockout phenotype by CRISPR/Cas9 editing.A Electrophoretic gel of the PCR products amplified with the genomic PCR primers in WT and ΔTSS H520 clones.B Comparative bar plot of significant NES (FDR<0.05) of GSEA (12) C2 Reactome gene sets run on RNAseq of Gapmer-treated H520 and of ΔTSS clones.
LUSC molecular subtypes (TCGA).Frequency of RB1 mutation for LUSC molecular subtypes (TCGA) is reported on the graph.F Boxplots of NFYC-AS1 expression level in SCLC cell lines classified by their molecular subtype (CCLE (3)).G ROC curves to test RB1wildtype/-mutated classification according to NFYC-AS1 expression in different NSCLC tissues (TCGA) and lung cancer cell lines (CCLE).The area under the curve (AUC) and the confidence interval are reported in the figure.For a good classifier: 0.5 < AUC <1.Throughout the figure, the gene expression level is expressed as logarithm in base 2 of the normalized counts (norm) or TPM plus one [log2(norm or TPM+1)].The tumor-normal fold-change (FCT/N) is calculated as the ratio between the average normalized counts or TPM for NFYC-AS1 in tumors and in normal tissues.p-values of the two-tailed unpaired t-test are reported, *p<0.05,**p<0.01,***p<0.001,****p<0.0001,ns (non-significant).[PI (proximal inflammatory), PP (proximal proliferative), TRU (terminal respiratory unit)].